1983-08-17
CROSS-REFERENCE UTILITY FOR IBM PC BASIC PROGRAMS
To produce a cross-reference listing for BASIC programs running on the IBM
Personal Computer, I wrote a program that scans a BASIC program file and
builds a list of variable names and the locations where they occur. The
program then sorts that list and writes it to a file.
The program expects a standard BASIC program file, that is, one saved without
the special A (ASCII - American National Standard Code for Information
Interchange) or P (protect) options. The standard save procedure stores the
program in a tokenized format in which all reserved BASIC words are
represented by tokens, which are 1- or 2-byte codes. For instance, the
RANDOMIZE statement is represented by a single byte with a value of 185.
This tokenized format saves space because multiple-character reserved words
are represented by only one or two characters.
All tokenized characters have a value of 128 or greater, outside the range of
ASCII values legal in variable names. Only capital letters, numerals, and the
period are legal in variable names, and these have values between 46 and 90.
(Variables can be entered in lowercase, but BASIC converts them to capitals.)
The restrictions on legal variable names simplify the work of the cross
reference program because it can usually just skip tokens in its search for
valid variable names. Two exceptions are the tokens for a remark (ASCII 143)
or data statement (ASCII 132). In both these cases the program skips to the
end of the line so as not to confuse words in remarks, or literals in data
statements, with valid variable names.
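The byte classification described above can be sketched roughly as follows.
This is a Python illustration only, not the BASIC listing itself; the function
and constant names are made up for the example, while the numeric values (128,
143, 132, and the characters allowed in names) come from the text.

```python
REM_TOKEN = 143   # token for a remark, per the text
DATA_TOKEN = 132  # token for a DATA statement, per the text

def is_name_char(b):
    """True if byte b may appear in a variable name: A-Z, 0-9, or period."""
    return 65 <= b <= 90 or 48 <= b <= 57 or b == 46

def is_token(b):
    """True if byte b is a tokenized reserved word (value 128 or greater)."""
    return b >= 128

def skips_rest_of_line(b):
    """True for the two tokens after which the rest of the line is ignored."""
    return b in (REM_TOKEN, DATA_TOKEN)
```

Note that the slash (ASCII 47) falls inside the 46 to 90 range but is not a
legal name character, so the sketch tests the three character classes
separately rather than the range as a whole.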
All numbers used in a program - constants, initial values, line-number
references, and so on - are also encoded. For instance, an ASCII 28 code
indicates that an integer value follows in the next 2 bytes in the file. An
ASCII 29 indicates a single-precision number in the next 4 bytes. Other
prefixes indicate various types of double-precision numbers, octal numbers,
or hexadecimal numbers.
The program skips over all coded numbers except those prefixed by an ASCII 14.
This code signifies a 2-byte number that is a program line number reference,
following a GOTO or GOSUB, for instance. The cross-reference listing program
treats line number references as labels and lists all lines referenced by
other lines. This can help you find all the places in a program that call a
certain subroutine.
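One way to picture the handling of coded numbers is a small table mapping each
prefix to the number of bytes that follow it. The Python sketch below lists
only the three prefixes whose sizes the text states; the names PAYLOAD_BYTES
and skip_number are hypothetical, and the byte order of the line-number
reference is assumed to match that of stored line numbers (least significant
byte first).

```python
# Prefix byte -> number of payload bytes that follow it.
# Only the prefixes spelled out in the text are listed; the other
# prefixes it mentions (double precision, octal, hexadecimal) would
# need entries of their own.
PAYLOAD_BYTES = {
    14: 2,  # line-number reference (after GOTO or GOSUB, for instance)
    28: 2,  # integer
    29: 4,  # single-precision number
}
LINE_REF = 14

def skip_number(data, i):
    """Given that data[i] is a known prefix, return (next_index, line_ref).

    line_ref is the referenced line number for prefix 14, otherwise None.
    """
    prefix = data[i]
    size = PAYLOAD_BYTES[prefix]
    payload = data[i + 1 : i + 1 + size]
    ref = None
    if prefix == LINE_REF:
        # Assumption: least significant byte first.
        ref = int.from_bytes(payload, "little")
    return i + 1 + size, ref
```

For example, the bytes 14, 232, 3 would be read as a reference to line 1000,
and scanning would resume at the byte after them.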
H O W   I T   W O R K S
The format of the lines in a tokenized program file is shown in figure 1. The
first 2 bytes are the BASIC offset address to the next program line. It
matters to us only when it is 0, because a 0 offset signifies the end of the
program. The next 2 bytes contain the line number, with the least significant
byte first. These are followed by a series of bytes, including tokens, coded
numbers, and variable names, up to the end of the line, indicated by an ASCII
value of 0.
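The 4-byte header at the start of each line can be decoded as in the Python
sketch below. It is an illustration under stated assumptions: the name
read_line_header is hypothetical, and the caller is assumed to pass the
position of the first byte of a line record (anything the file may hold before
the first record is outside this sketch).

```python
import struct

def read_line_header(data, pos):
    """Decode the header of the line record that starts at data[pos].

    Returns (line_number, body_start), or None when the offset field
    is 0, which marks the end of the program.
    """
    # The offset is only tested against 0, so its byte order does not matter.
    (offset,) = struct.unpack_from("<H", data, pos)
    if offset == 0:
        return None
    # Line number: 2 bytes, least significant byte first.
    (line_number,) = struct.unpack_from("<H", data, pos + 2)
    return line_number, pos + 4
```

The body of the line then runs from body_start to the terminating 0 byte.
Because the payload of a coded number may itself contain a 0 byte, a scanner
should step over those payloads rather than search blindly for the next 0.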
The approach of the cross reference utility, then, is very simple. It makes a
note of the line number being scanned at the moment, then skips over tokens
and encoded numbers, looking for variable names and references to other
program lines. When it finds the beginning of a variable name, it builds the
name, character by character, until it comes to an ASCII code that can't be
part of a variable name. If the variable has been explicitly typed (marked by
a $, #, !, or %), that character is added to the end of the name. If the
variable is subscripted, then "(SUB)" is added. Once complete, the variable
name is stored in an array; the line number where it appears is stored in a
parallel array of line numbers.
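The name-building step might look like the following Python sketch. The name
read_name is hypothetical, and the test for a subscript (an opening
parenthesis directly after the name) is an assumption, since the text does not
say how the program detects one.

```python
TYPE_SUFFIXES = b"$#!%"

def read_name(body, i):
    """Build the variable name that starts at body[i].

    Returns (name, next_index). Assumes body[i] is a capital letter.
    """
    start = i
    # Collect capital letters, numerals, and periods.
    while i < len(body) and (
        65 <= body[i] <= 90 or 48 <= body[i] <= 57 or body[i] == 46
    ):
        i += 1
    name = body[start:i].decode("ascii")
    # An explicit type character becomes part of the name.
    if i < len(body) and body[i] in TYPE_SUFFIXES:
        name += chr(body[i])
        i += 1
    # Assumption: a subscript is recognized by "(" right after the name.
    if i < len(body) and body[i] == ord("("):
        name += "(SUB)"
    return name, i
```

Each name returned would then be appended to one list, and the current line
number to a second list at the same index, mirroring the parallel arrays
described above.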
Once the entire program file has been scanned, the label and line number
arrays are sorted using a Shell sort. Then they are written to a disk file.
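For readers unfamiliar with it, a Shell sort over two parallel arrays can be
sketched in Python as below. This is a generic version of the technique with a
simple halving gap sequence, not a transcription of the routine at line 800;
the function name and the choice to order by line number within a label are
assumptions.

```python
def shell_sort_parallel(labels, lines):
    """Sort labels in place, moving each line number with its label."""
    gap = len(labels) // 2
    while gap > 0:
        for i in range(gap, len(labels)):
            label, line = labels[i], lines[i]
            j = i
            # Assumed ordering: by label, then by line number.
            while j >= gap and (labels[j - gap], lines[j - gap]) > (label, line):
                labels[j] = labels[j - gap]
                lines[j] = lines[j - gap]
                j -= gap
            labels[j], lines[j] = label, line
        gap //= 2
```

Every move of a label is matched by the same move of its line number, so the
two arrays stay in step throughout the sort.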
The only real problem is that all the scanning and sorting takes time. The
program took nearly 7 minutes to process and sort labels for its own 145 lines
and 245 label references. For a smaller program (123 references and 133
labels), it required 3 minutes 45 seconds. You can get a modest increase in
speed of about 5 to 10 percent by eliminating comments and consolidating
statements into one line where possible. This will have the greatest effect
in the WHILE loop beginning at line 600 and the sort routine beginning at line
800.
B E   W A R Y
In order not to slow the program down further, I kept it as simple as
possible. Because of this, a few bogus variables may creep into your listing.
These are words used as part of BASIC statements that are not tokenized. They
include the following: ALL in a CHAIN statement, BASE in an OPTION BASE
statement, B or BF in a LINE statement, R in a LOAD statement, and AS, APPEND,
or OUTPUT in an OPEN statement. None of these is a reserved word, and they
are therefore not tokenized. Thus, if you use them in a program, the cross
reference utility will treat them as variable names. Note that both AS and
OUTPUT appear in listing 2.
The cross reference listing is written to a sequential disk file, which may be
read later. The file name for the listing is the file name of the program
file with an extension of CRF. If the original program file were MYPROG.BAS,
the listing would appear on file MYPROG.CRF. To display the listing on your
monitor, you first need to return to DOS (the disk operating system) by
executing SYSTEM from BASIC. Once in DOS, execute TYPE MYPROG.CRF. If you want
a hard copy, press the Ctrl and PrtSc keys simultaneously before executing the
TYPE command; the listing on the monitor will then be output to the printer.
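The naming rule for the listing file amounts to replacing the extension with
CRF. A small Python sketch of that rule follows; the function name is
hypothetical, and the handling of a name with no extension is an assumption.

```python
def listing_name(program_name):
    """Return the listing file name: same base name, extension CRF."""
    base, dot, _ext = program_name.rpartition(".")
    if not dot:
        # Assumption: a name without an extension simply gains .CRF.
        base = program_name
    return base + ".CRF"
```

With this rule, MYPROG.BAS maps to MYPROG.CRF, as in the example above.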
M O D I F I C A T I O N S
The output is formatted for an 80-column screen or printer as the program
appears in listing 1. To format for a 40-column screen, change N>8 in line
3070 to N>3. To format for a 132-column printer width, change it to N>16.
You may also want to redimension the arrays in statement 110. They are large
enough for modest programs, but larger programs with more references will need
more space.